2,471 research outputs found

    Entropy-scaling search of massive biological data

    Many datasets exhibit a well-defined structure that can be exploited to design faster search tools, but it is not always clear when such acceleration is possible. Here, we introduce a framework for similarity search based on characterizing a dataset's entropy and fractal dimension. We prove that searching scales in time with metric entropy (number of covering hyperspheres), if the fractal dimension of the dataset is low, and scales in space with the sum of metric entropy and information-theoretic entropy (randomness of the data). Using these ideas, we present accelerated versions of standard tools, with no loss in specificity and little loss in sensitivity, for use in three domains: high-throughput drug screening (Ammolite, 150x speedup), metagenomics (MICA, 3.5x speedup of DIAMOND [3,700x BLASTX]), and protein structure search (esFragBag, 10x speedup of FragBag). Our framework can be used to achieve "compressive omics," and the general theory can be readily applied to data science problems outside of biology. Comment: Including supplement: 41 pages, 6 figures, 4 tables, 1 bo
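
    The idea of searching over covering hyperspheres can be illustrated with a two-stage range search. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the greedy covering, the Euclidean metric, and the names `build_cover` and `range_search` are hypothetical choices made for the example. It relies on the triangle inequality: a point within `radius` of the query must belong to a cluster whose center lies within `radius + cluster_radius` of the query, so skipping the other clusters loses no hits.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Cluster = Tuple[Point, List[Point]]  # (center, members)


def euclidean(a: Point, b: Point) -> float:
    """Plain Euclidean distance; any true metric would work here."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_cover(points: Sequence[Point], cluster_radius: float,
                dist: Callable[[Point, Point], float] = euclidean) -> List[Cluster]:
    """Greedy covering (an assumption of this sketch): each point joins the
    first center within cluster_radius, otherwise it becomes a new center."""
    clusters: List[Cluster] = []
    for p in points:
        for center, members in clusters:
            if dist(center, p) <= cluster_radius:
                members.append(p)
                break
        else:
            clusters.append((p, [p]))
    return clusters


def range_search(clusters: Sequence[Cluster], query: Point, radius: float,
                 cluster_radius: float,
                 dist: Callable[[Point, Point], float] = euclidean) -> List[Point]:
    """Coarse step: keep clusters whose center is within radius + cluster_radius.
    Fine step: scan only the members of those clusters."""
    hits: List[Point] = []
    for center, members in clusters:
        if dist(center, query) <= radius + cluster_radius:
            hits.extend(m for m in members if dist(m, query) <= radius)
    return hits


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
clusters = build_cover(points, cluster_radius=1.0)
result = range_search(clusters, (0.0, 0.0), radius=0.5, cluster_radius=1.0)
brute_force = [p for p in points if euclidean(p, (0.0, 0.0)) <= 0.5]
```

    In this toy run the coarse step compares the query against three centers rather than all five points, and the fine step returns the same hits as the brute-force scan.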

    DMseg: a Python algorithm for de novo detection of differentially or variably methylated regions

    Detecting and assessing statistical significance of differentially methylated regions (DMRs) is a fundamental task in methylome association studies. While the average differential methylation in different phenotype groups has been the inferential focus, methylation changes in chromosomal regions may also present as differential variability, i.e., variably methylated regions (VMRs). Testing statistical significance of regional differential methylation is a challenging problem, and existing algorithms do not provide accurate type I error control for genome-wide DMR or VMR analysis. No algorithm has been publicly available for detecting VMRs. We propose DMseg, a Python algorithm with efficient DMR/VMR detection and significance assessment for array-based methylome data, and compare its performance to Bumphunter, a popular existing algorithm. Operationally, DMseg searches for DMRs or VMRs within CpG clusters that are adaptively determined by both gap distance and correlation between contiguous CpG sites in a microarray. The Levene test was implemented for assessing differential variability of individual CpGs. A likelihood ratio statistic is proposed to test for a constant difference within CpGs in a DMR or VMR to summarize the evidence of regional difference. Using a stratified permutation scheme and pooling null distributions of LRTs from clusters with similar numbers of CpGs, DMseg provides accurate control of the type I error rate. In simulation experiments, DMseg shows greater power than Bumphunter to detect DMRs. Application to methylome data of Barrett's esophagus and esophageal adenocarcinoma reveals a number of DMRs and VMRs of biological interest.
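
    The clustering step described above, which groups contiguous CpG sites by gap distance and correlation, might look roughly like the following sketch. It does not use DMseg's actual API: the function names `pearson` and `cluster_cpgs`, the thresholds `max_gap` and `min_corr`, and the choice of Pearson correlation are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def cluster_cpgs(positions: Sequence[int], beta: Sequence[Sequence[float]],
                 max_gap: int, min_corr: float) -> List[List[int]]:
    """Group site indices into clusters.

    positions: genomic coordinates, sorted ascending (one chromosome).
    beta[i]:   methylation values of site i across samples.
    A site joins the current cluster only if it is within max_gap of the
    previous site AND correlates with it at least min_corr (hypothetical rule).
    """
    if not positions:
        return []
    clusters: List[List[int]] = [[0]]
    for i in range(1, len(positions)):
        close = positions[i] - positions[i - 1] <= max_gap
        correlated = pearson(beta[i - 1], beta[i]) >= min_corr
        if close and correlated:
            clusters[-1].append(i)
        else:
            clusters.append([i])
    return clusters


positions = [100, 150, 200, 5000]
beta = [
    [0.1, 0.5, 0.9],
    [0.2, 0.6, 0.8],
    [0.9, 0.5, 0.1],  # anti-correlated with its neighbor, so it starts a new cluster
    [0.1, 0.5, 0.9],  # too far from the previous site
]
clusters = cluster_cpgs(positions, beta, max_gap=500, min_corr=0.5)
```

    Region-level testing (the Levene test, the likelihood ratio statistic, and the stratified permutation scheme) would then operate within each cluster; those steps are not sketched here.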

    Spectral classification of the brightest objects in the galactic star forming region W40

    We present high S/N, moderate resolution near-infrared spectra, as well as 10 micron imaging, for the brightest members of the central stellar cluster in the W40 HII region, obtained using the SpeX and MIRSI instruments at NASA's Infrared Telescope Facility. Using these observations combined with archival Spitzer Space Telescope data, we have determined the spectral classifications, extinction, distances, and spectral energy distributions for the brightest members of the cluster. Of the eight objects observed, we identify four main sequence (MS) OB stars, two Herbig Ae/Be stars, and two low-mass young stellar objects. Strong HeI absorption at 1.083 micron in the MS star spectra strongly suggests that at least some of these sources are in fact close binaries. Two out of the four MS stars also show significant infrared excesses typical of circumstellar disks. Extinctions and distances were determined for each MS star by fitting model stellar atmospheres to the SEDs. We estimate a distance to the cluster of between 455 and 535 pc, which agrees well with earlier (but far less precise) distance estimates. We conclude that the late-O star we identify is the dominant source of LyC luminosity needed to power the W40 HII region and is the likely source of the stellar wind that has blown a large (~4 pc) pinched-waist bubble observed in wide field mid-IR images. We also suggest that 3.6 cm radio emission observed from some of the sources in the cluster is likely not due to emission from ultra-compact HII regions, as suggested in other work, due to size constraints based on our derived distance to the cluster. Finally, we also present a discussion of the curious source IRS 3A, which has a very strong mid-IR excess (despite its B3 MS classification) and appears to be embedded in a dusty envelope roughly 2700 AU in size. Comment: Accepted for publication in The Astronomical Journal. 29 pages, 10 figure

    DNA Checkpoint and Repair Factors Are Nuclear Sensors for Intracellular Organelle Stresses-Inflammations and Cancers Can Have High Genomic Risks.

    Under inflammatory conditions, inflammatory cells release reactive oxygen species (ROS) and reactive nitrogen species (RNS), which cause DNA damage. If not appropriately repaired, DNA damage leads to gene mutations and genomic instability. DNA damage checkpoint factors (DDCFs) and DNA damage repair factors (DDRFs) play a vital role in maintaining genomic integrity. However, how DDCFs and DDRFs are modulated under physiological and pathological conditions is not fully known. We performed an experimental database analysis to determine the expression of 26 DNA D

    Ancilla-based quantum simulation

    We consider simulating the BCS Hamiltonian, a model of low temperature superconductivity, on a quantum computer. In particular we consider conducting the simulation on the qubus quantum computer, which uses a continuous variable ancilla to generate interactions between qubits. We demonstrate an O(N^3) improvement over previous work conducted on an NMR computer [PRL 89 057904 (2002) & PRL 97 050504 (2006)] for the nearest neighbour and completely general cases. We then go on to show methods to minimise the number of operations needed per time step using the qubus in three cases: a completely general case, a case of exponentially decaying interactions, and the case of fixed range interactions. We make these results controlled on an ancilla qubit so that we can apply the phase estimation algorithm, and hence show that when N \geq 5, our qubus simulation requires significantly fewer operations than a similar simulation conducted on an NMR computer. Comment: 20 pages, 10 figures: V2 added section on phase estimation and performing controlled unitaries, V3 corrected minor typo

    Two-step membrane binding by the bacterial SRP receptor enable efficient and accurate Co-translational protein targeting

    The signal recognition particle (SRP) delivers ~30% of the proteome to the eukaryotic endoplasmic reticulum, or the bacterial plasma membrane. The precise mechanism by which the bacterial SRP receptor, FtsY, interacts with and is regulated at the target membrane remains unclear. Here, quantitative analysis of FtsY-lipid interactions at single-molecule resolution revealed a two-step mechanism in which FtsY initially contacts membrane via a Dynamic mode, followed by an SRP-induced conformational transition to a Stable mode that activates FtsY for downstream steps. Importantly, mutational analyses revealed extensive auto-inhibitory mechanisms that prevent free FtsY from engaging membrane in the Stable mode; an engineered FtsY pre-organized into the Stable mode led to indiscriminate targeting in vitro and disrupted FtsY function in vivo. Our results show that the two-step lipid-binding mechanism uncouples the membrane association of FtsY from its conformational activation, thus optimizing the balance between the efficiency and fidelity of co-translational protein targeting.

    Inhibiting the oncogenic translation program is an effective therapeutic strategy in multiple myeloma

    Published in final edited form as: Sci Transl Med. 2017 May 10; 9(389). https://doi.org/10.1126/scitranslmed.aal2668. Multiple myeloma (MM) is a frequently incurable hematological cancer in which overactivity of MYC plays a central role, notably through up-regulation of ribosome biogenesis and translation. To better understand the oncogenic program driven by MYC and investigate its potential as a therapeutic target, we screened a chemically diverse small-molecule library for anti-MM activity. The most potent hits identified were rocaglate scaffold inhibitors of translation initiation. Expression profiling of MM cells revealed reversion of the oncogenic MYC-driven transcriptional program by CMLD010509, the most promising rocaglate. Proteome-wide reversion correlated with selective depletion of short-lived proteins that are key to MM growth and survival, most notably MYC, MDM2, CCND1, MAF, and MCL-1. The efficacy of CMLD010509 in mouse models of MM confirmed the therapeutic relevance of these findings in vivo and supports the feasibility of targeting the oncogenic MYC-driven translation program in MM with rocaglates.

    An overview of the Michigan Positron Microscope Program

    An overview of the Michigan Positron Microscope Program is presented with particular emphasis on the second generation microscope that is presently near completion. The design and intended applications of this microscope will be summarized. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/87602/2/391_1.pd